
CORE-Bench v1.1

mentions 1 type Person

// recent coverage 1 mention

04:00
2026-06-26
arxiv.org
artificial-intelligence

Life After Benchmark Saturation: A Case Study of CORE-Bench

In an arXiv preprint, researchers propose a multi-dimensional framework for evaluating AI agents once accuracy on a benchmark has saturated, using CORE-Bench Hard as a case study. They introduce CORE-Bench v1.1 and an out-of-dist…

// co-occurs with top 4 entities